Use custom iterators to speed up `inflow_ids` and friends #830

visr · 2023-11-26T20:29:14Z

@Huite mentioned that the performance of medium-sized models regressed recently, so I did some profiling. A lot of time was spent on finding the flow in- and outneighbors. That code changed quite a bit in #807, and while the API is a lot nicer, it unfortunately also introduced a performance regression: the profile showed allocations. This PR removes those allocations, giving a 2x speedup for a test model, from 16s to 8s. Before #807 this model ran in 3s, however, so there is still a considerable gap to close. I don't really want to revert #807, though perhaps for flow connections we need another data structure, or some other way to speed things up.

This PR changes from a filtered array comprehension, which creates a `Vector{NodeID}`, to a custom iterator `InNeighbors` that iterates lazily over the filtered inneighbors. This avoids the allocations but otherwise yields the same elements.
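To illustrate the idea, here is a minimal sketch of the two approaches side by side. It is not the code from this PR: `ToyGraph`, its fields, and `inflow_ids_alloc` are hypothetical stand-ins for the real graph type, and `NodeID` is reduced to a bare wrapper around an `Int`.

```julia
# Hypothetical stand-in for the node label type.
struct NodeID
    value::Int
end

# Hypothetical toy graph: the inneighbors of each node, plus the set of flow edges.
struct ToyGraph
    inneighbors::Dict{NodeID, Vector{NodeID}}
    flow_edges::Set{Tuple{NodeID, NodeID}}
end

# Allocating version: the filtered comprehension builds a new Vector{NodeID} on every call.
inflow_ids_alloc(graph::ToyGraph, id::NodeID) =
    [src for src in graph.inneighbors[id] if (src, id) in graph.flow_edges]

# Lazy version: a small struct that only holds references and filters while iterating.
struct InNeighbors
    graph::ToyGraph
    id::NodeID
end

# The element type is known, but the length is not, since it depends on the filter.
Base.IteratorSize(::Type{InNeighbors}) = Base.SizeUnknown()
Base.eltype(::Type{InNeighbors}) = NodeID

function Base.iterate(iter::InNeighbors, state = 1)
    candidates = iter.graph.inneighbors[iter.id]
    # Skip candidates that are not connected to `id` by a flow edge.
    while state <= length(candidates)
        src = candidates[state]
        if (src, iter.id) in iter.graph.flow_edges
            return src, state + 1
        end
        state += 1
    end
    return nothing
end

inflow_ids(graph::ToyGraph, id::NodeID) = InNeighbors(graph, id)
```

A `for src in inflow_ids(graph, id)` loop then visits the same IDs as the comprehension, without materializing an intermediate vector.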

SouthEndMusic · 2023-11-27T08:00:33Z

@visr nice that you picked this up. Regarding the gap to close: this might be due to the dictionary that maps from an edge to an index in the dense flow vector. The `SparseMatrixCSC` we used before does not use a dictionary internally, so I suspect we can learn something from that implementation.

SouthEndMusic · 2023-11-27T08:11:18Z

Also, `get_tmp` is called much more often now, but I thought that was basically free.

Hofer-Julian · 2023-11-27T08:38:59Z

core/src/allocation.jl

@@ -13,8 +13,8 @@ function allocation_graph_used_nodes!(p::Parameters, allocation_network_id::Int)
        node_type = graph[node_id].type
        if node_type in [:user, :basin]
            push!(used_nodes, node_id)
-
-        elseif length(inoutflow_ids(graph, node_id)) > 2
+        elseif count(x -> true, inoutflow_ids(graph, node_id)) > 2


Why is that preferable? I had assumed that `length` simply consumes unsized iterators, just like your code using `count`.

No, it's actually not defined; calling it throws a `MethodError`.
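The same behaviour can be reproduced with a lazily filtered iterator from Base. This is only a sketch using `Iterators.filter` as a stand-in for the custom iterator in this PR; `has_length` and `n` are names chosen for illustration.

```julia
# A lazily filtered iterator has no known length up front.
itr = Iterators.filter(iseven, 1:10)

# Its size trait is Base.SizeUnknown(), so Base provides no `length` method for it.
size_trait = Base.IteratorSize(typeof(itr))

# Calling `length(itr)` therefore throws a MethodError.
has_length = try
    length(itr)
    true
catch err
    err isa MethodError || rethrow()
    false
end

# Counting consumes the iterator once, without building an intermediate array.
n = count(x -> true, itr)
```

A custom iterator could instead define `Base.length` itself, but for a filtered iterator that would still mean an extra pass over the elements.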

visr · 2023-11-27T10:04:40Z

> this might be due to the dictionary to map from an edge to an index in the dense flow vector.

This didn't show up in the profile, so I think it's fast enough, though the dict lookup that checks whether an edge is a flow edge did show up. This was for a model with only flow edges. `get_tmp` also didn't show up.

@SouthEndMusic is this OK to merge for you? I think this fixes the issue well enough to make a release; we can look into more speedups afterwards.

SouthEndMusic · 2023-11-27T10:41:21Z

@visr disregard my comments above, those are about #828 which I am working on at the moment.

If this works and indeed gives the required speedup, it is fine by me. I would like you to walk me through how it works this afternoon if you have time.

SouthEndMusic

LGTM

Use custom iterators to speed up inflow_ids and friends

571291e

Hofer-Julian reviewed Nov 27, 2023

visr requested a review from SouthEndMusic November 27, 2023 10:04

SouthEndMusic reviewed Nov 27, 2023

Merge branch 'main' into iterate

cd6f232

visr merged commit b8d564b into main Nov 27, 2023
15 checks passed

visr deleted the iterate branch November 27, 2023 12:53

SouthEndMusic mentioned this pull request Nov 29, 2023

Meta-issue: Use MetaGraphsNext.jl for graphs with node and edge IDs #732

Closed

visr mentioned this pull request Dec 12, 2023

Pre-calculate flow neighbor IDs per node #889

Closed
